
## Overview

This dataset contains four files of historical performance benchmark results, one for each of the SPEC1995, SPEC2000, SPEC2006, and SPEC2017 suites published by the Standard Performance Evaluation Corporation (SPEC). Each file includes detailed hardware specifications, system configurations, and performance scores for the computer systems tested under that suite. The dataset can be used to analyze the historical progression of CPU performance, to study the relationship between hardware specifications (such as cache size and clock speed) and benchmark scores, or for related studies.

### File Information

* `df1995.csv`: systems and performance scores tested with SPEC1995. More information is available at https://www.spec.org/cpu95/qanda.html.
* `df2000.csv`: systems and performance scores tested with SPEC2000. More information is available at https://www.spec.org/cpu2000/docs/.
* `df2006.csv`: systems and performance scores tested with SPEC2006. More information is available at https://www.spec.org/cpu2006/docs/.
* `df2017.csv`: systems and performance scores tested with SPEC2017. More information is available at https://www.spec.org/cpu2017/docs/.
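
As a starting point, the files can be read with Python's standard `csv` module. The sketch below is only an illustration: the helper name `load_suite` and the `data_dir` parameter are hypothetical, and it assumes the files are UTF-8 encoded with a header row.

```python
import csv
from pathlib import Path

# File names as listed above; where they live on disk is an assumption.
SUITE_FILES = {
    1995: "df1995.csv",
    2000: "df2000.csv",
    2006: "df2006.csv",
    2017: "df2017.csv",
}


def load_suite(year, data_dir="."):
    """Return one suite's rows as a list of dicts keyed by column name."""
    path = Path(data_dir) / SUITE_FILES[year]
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Every cell comes back as a string, so numeric columns still need conversion before analysis.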

## Data Dictionary

### 1. Benchmark Scores

All SPEC suites include both SPECspeed and SPECrate metrics. "SPECspeed" is a time-based metric and "SPECrate" is a throughput metric. Similarly, all suites include both "base" and "peak" metrics. The "base" metrics require strict adherence to standard compilation rules, while the "peak" metrics allow aggressive optimizations. A more detailed description is available at https://spec.org/cpu2017/docs/overview.html#Q17.

#### Key Metric Columns:

* `base_floating_point_speed` / `peak_floating_point_speed`: Overall floating-point time-based performance metric.
* `base_integer_speed` / `peak_integer_speed`: Overall integer time-based performance metric.

* `base_floating_point_rate` / `peak_floating_point_rate`: Overall floating-point throughput performance metric.
* `base_integer_rate` / `peak_integer_rate`: Overall integer throughput performance metric.
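
One way to compare the two tuning levels is the ratio of a peak score to its base counterpart. The following is a minimal sketch, assuming each row is a dict of strings (as produced by `csv.DictReader`) and that missing scores appear as blank or `NaN` cells; the helper names are hypothetical.

```python
import math


def to_float(cell):
    """Convert a CSV cell to float; blank, NaN, or unparsable cells become None."""
    try:
        number = float(cell)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def peak_to_base_ratio(row, metric):
    """Return peak/base for a metric such as 'integer_speed', or None if unavailable."""
    base = to_float(row.get(f"base_{metric}"))
    peak = to_float(row.get(f"peak_{metric}"))
    if base is None or peak is None or base == 0:
        return None
    return peak / base
```

For example, `peak_to_base_ratio(row, "floating_point_rate")` reads `base_floating_point_rate` and `peak_floating_point_rate` from the row.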

#### Micro-benchmarks:

The component benchmarks (micro-benchmarks) used to calculate the overall metrics vary between suites. We strongly recommend consulting the SPEC website for detailed descriptions of each test. The lists below cover the micro-benchmarks of the SPEC1995 suite only.

**Individual Benchmarks (Integer):**

* `go_099`: Artificial intelligence (Go game)
* `m88ksim_124`: Motorola 88K chip simulator
* `gcc_126`: C Compiler
* `compress_129`: Data compression
* `li_130`: LISP interpreter
* `ijpeg_132`: Image compression
* `perl_134`: Perl interpreter
* `vortex_147`: Object-oriented database

**Individual Benchmarks (Floating Point):**

* `tomcatv_101`: Mesh generation
* `swim_102`: Shallow water equation
* `su2cor_103`: Quantum physics
* `hydro2d_104`: Astrophysics
* `mgrid_107`: Multi-grid solver
* `applu_110`: Parabolic / Elliptic PDEs
* `turb3d_125`: Turbulence modeling
* `apsi_141`: Weather prediction
* `fpppp_145`: Chemistry / Quantum
* `wave5_146`: Plasma physics

*(Note: Columns appear as `base_[benchmark]` and `peak_[benchmark]`)*
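
The naming rule in the note can be used to gather a system's component scores programmatically. The sketch below builds column names such as `base_go_099` and collects the numeric cells. It also computes a geometric mean, which is how SPEC combines component ratios into an overall metric. Whether the per-benchmark columns in these files hold ratios rather than raw run times is an assumption this README does not confirm, so treat the composite as a sanity check only. The function names are hypothetical.

```python
import math
from statistics import geometric_mean

# Integer component benchmarks of the SPEC1995 suite, as listed above.
SPEC95_INT = [
    "go_099", "m88ksim_124", "gcc_126", "compress_129",
    "li_130", "ijpeg_132", "perl_134", "vortex_147",
]


def component_scores(row, benchmarks, tuning="base"):
    """Collect the numeric per-benchmark cells present in a row."""
    scores = {}
    for name in benchmarks:
        cell = row.get(f"{tuning}_{name}")
        try:
            value = float(cell)
        except (TypeError, ValueError):
            continue  # blank, missing, or non-numeric cell
        if not math.isnan(value):
            scores[name] = value
    return scores


def composite(scores):
    """Geometric mean of the component scores, or None if there are none."""
    return geometric_mean(list(scores.values())) if scores else None
```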

### 2. System Hardware

* `manufacturer`: The company that produced the system (e.g., Sun, Intel, SGI).
* `system_model`: The model name of the computer.
* `accepted_memory`: RAM installed in the test system.
* `disk_subsystem`: Hard drive specifications.
* `file_system`: The file system used (e.g., UFS, NTFS).
* `l1_cache_unspecified`, `l2_...`, `l3_...`: Cache sizes (if explicitly listed in the top-level summary).

### 3. Processor Details

* `processors`: A **JSON-formatted string** containing a detailed breakdown of the CPUs, including count, individual core specs, and cache hierarchies.
* `cpu_name`: The marketing name of the processor.
* `clock_speed`: CPU frequency (in MHz).
* `codename`: Internal architecture codename.
* `tdp`: Thermal Design Power.
* `die_size`: Physical size of the chip.
* `transistors`: Transistor count (in millions).
* `technology`: Fabrication process size.
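
Because `processors` is stored as a JSON string, it has to be decoded before its contents can be inspected. A minimal sketch follows, assuming the cell is either valid JSON or empty. The internal layout of the decoded object is not documented here, so the code makes no assumptions about its keys. The helper name is hypothetical.

```python
import json


def parse_processors(cell):
    """Decode the `processors` cell; return None for blank or malformed values."""
    if not isinstance(cell, str) or not cell.strip():
        return None
    try:
        return json.loads(cell)
    except json.JSONDecodeError:
        return None
```

Inspecting a few decoded values (for instance with `print(json.dumps(parsed, indent=2))`) is a reasonable first step to learn the structure before writing extraction logic.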

### 4. Software & Environment

* `operating_system` / `kernel`: OS version.
* `compiler`: Compiler version and flags used.
* `date`: The date the benchmark was published/measured (Format: YYYY-MM-DD).
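
Since `date` follows the YYYY-MM-DD format, it can be parsed with the standard library and used to group results over time. The sketch below counts results per calendar year. It assumes rows are dicts of strings, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def parse_result_date(cell):
    """Parse a YYYY-MM-DD cell into a date; return None if it does not match."""
    try:
        return datetime.strptime(cell.strip(), "%Y-%m-%d").date()
    except (AttributeError, ValueError):
        return None


def results_per_year(rows):
    """Count rows per calendar year, skipping rows without a usable date."""
    years = Counter()
    for row in rows:
        parsed = parse_result_date(row.get("date"))
        if parsed is not None:
            years[parsed.year] += 1
    return dict(sorted(years.items()))
```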

## Usage Notes

1. **Missing Values:** Many columns contain `NaN` (null) values. This is expected as not every system reports every specific cache parameter or runs every optional "Peak" benchmark.
2. **Data Types:** Most metrics are floats. Hardware specs are often strings that may require cleaning (e.g., "1 GB sdram" vs "512mb").
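
As an illustration of the cleaning mentioned in note 2, the sketch below normalizes free-text memory sizes such as those in `accepted_memory` to megabytes. It assumes binary units (1 GB = 1024 MB) and reads only the first size found in the string, so a multi-module description such as "4 x 256 MB" would yield the per-module size rather than the total. The function name is hypothetical.

```python
import re

# Conversion factors to megabytes, assuming binary units.
_UNITS_IN_MB = {"kb": 1 / 1024, "mb": 1, "gb": 1024, "tb": 1024 * 1024}
_SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(kb|mb|gb|tb)\b", re.IGNORECASE)


def memory_in_mb(cell):
    """Return the first size found in a free-text cell, in MB, or None."""
    if not isinstance(cell, str):
        return None
    match = _SIZE.search(cell)
    if match is None:
        return None
    amount, unit = match.groups()
    return float(amount) * _UNITS_IN_MB[unit.lower()]
```

With the two example strings from the note, `memory_in_mb("1 GB sdram")` and `memory_in_mb("512mb")` both return values in the same unit and can be compared directly.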

## Source

Data extracted from public SPEC benchmark websites, maintained by the Standard Performance Evaluation Corporation (SPEC). Original result pages are linked in the `information_source` column.